这项工作介绍了使用伪层作为费米子决定因素的随机估计量的费米子晶状体理论中基于流动采样的量规均值架构。这是最先进的晶格场理论计算中的默认方法,这使得对流向模型在QCD等理论的实际应用至关重要。还概述了通过标准技术(例如/奇数预处理和HasenBusch分解)来改进基于流的采样方法的方法。提供了二维U(1)和SU(3)具有$ n_f = 2 $ FERMIONS的量规理论的数值演示。
translated by 谷歌翻译
我们提出了连续重复的退火流传输蒙特卡洛(CRAFT),该方法结合了顺序的蒙特卡洛(SMC)采样器(本身是退火重要性采样的概括)与使用归一化流量的变异推断。直接训练了归一化的流量,可用于使用KL差异进行每个过渡,以在退火温度之间运输。使用归一化流/SMC近似值估算了此优化目标。我们从概念上展示并使用多个经验示例,这些示例可以改善退火流运输蒙特卡洛(Arbel等,2021),并在其上建造,也可以在基于马尔可夫链蒙特卡洛(MCMC)基于基于的随机归一化流(Wu等人。2020)。通过将工艺纳入粒子MCMC中,我们表明,这种学识渊博的采样器可以在具有挑战性的晶格场理论示例中获得令人印象深刻的准确结果。
translated by 谷歌翻译
基于标准化流的算法是由于有希望的机器学习方法,以便以可以使渐近精确的方式采样复杂的概率分布。在格子场理论的背景下,原则上的研究已经证明了这种方法对标量理论,衡量理论和统计系统的有效性。这项工作开发了能够使用动力学蜕皮的基于流动的理论采样的方法,这对于应用于粒子物理标准模型和许多冷凝物系的晶格场理论研究是必要的。作为一种实践演示,这些方法应用于通过Yukawa相互作用耦合到标量场的无大量交错的费米子的二维理论的现场配置的采样。
translated by 谷歌翻译
Much of the information of breathing is contained within the photoplethysmography (PPG) signal, through changes in venous blood flow, heart rate and stroke volume. We aim to leverage this fact, by employing a novel deep learning framework which is a based on a repurposed convolutional autoencoder. Our model aims to encode all of the relevant respiratory information contained within photoplethysmography waveform, and decode it into a waveform that is similar to a gold standard respiratory reference. The model is employed on two photoplethysmography data sets, namely Capnobase and BIDMC. We show that the model is capable of producing respiratory waveforms that approach the gold standard, while in turn producing state of the art respiratory rate estimates. We also show that when it comes to capturing more advanced respiratory waveform characteristics such as duty cycle, our model is for the most part unsuccessful. A suggested reason for this, in light of a previous study on in-ear PPG, is that the respiratory variations in finger-PPG are far weaker compared with other recording locations. Importantly, our model can perform these waveform estimates in a fraction of a millisecond, giving it the capacity to produce over 6 hours of respiratory waveforms in a single second. Moreover, we attempt to interpret the behaviour of the kernel weights within the model, showing that in part our model intuitively selects different breathing frequencies. The model proposed in this work could help to improve the usefulness of consumer PPG-based wearables for medical applications, where detailed respiratory information is required.
translated by 谷歌翻译
基于BERT的微调模型在内存,计算和时间上是资源密集的。尽管许多先前的工作旨在通过压缩技术(例如修剪)提高推论效率,但这些作品并未明确解决培训对下游任务的计算挑战。我们介绍了学习者模块和启动,新颖的方法,以利用预训练的语言模型的过度参数化,以获得收敛速度和资源利用率的好处。学习者模块通过微调参数的微调来导航1)有效训练的双结合,以及2)通过确保快速收敛和高度度量得分有效训练。我们在Distilbert上的结果表明,学习者在与基础方面的表现或超过基线。学习者训练7倍的参数比胶水上的最新方法少。在可乐方面,学习者快速调整20%,并且资源利用率显着降低。
translated by 谷歌翻译
It is common practice in deep learning to represent a measurement of the world on a discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these measurements is often continuous, e.g. the scene depicted in an image. A powerful continuous alternative is then to represent these measurements using an implicit neural representation, a neural function trained to output the appropriate measurement value for any input spatial location. In this paper, we take this idea to its next level: what would it take to perform deep learning on these functions instead, treating them as data? In this context we refer to the data as functa, and propose a framework for deep learning on functa. This view presents a number of challenges around efficient conversion from data to functa, compact representation of functa, and effectively solving downstream tasks on functa. We outline a recipe to overcome these challenges and apply it to a wide range of data modalities including images, 3D shapes, neural radiance fields (NeRF) and data on manifolds. We demonstrate that this approach has various compelling properties across data modalities, in particular on the canonical tasks of generative modeling, data imputation, novel view synthesis and classification. Code: https://github.com/deepmind/functa
translated by 谷歌翻译
Normalizing flows provide a general mechanism for defining expressive probability distributions, only requiring the specification of a (usually simple) base distribution and a series of bijective transformations. There has been much recent work on normalizing flows, ranging from improving their expressive power to expanding their application. We believe the field has now matured and is in need of a unified perspective. In this review, we attempt to provide such a perspective by describing flows through the lens of probabilistic modeling and inference. We place special emphasis on the fundamental principles of flow design, and discuss foundational topics such as expressive power and computational trade-offs. We also broaden the conceptual framing of flows by relating them to more general probability transformations. Lastly, we summarize the use of flows for tasks such as generative modeling, approximate inference, and supervised learning.
translated by 谷歌翻译
Reasoning about objects, relations, and physics is central to human intelligence, and a key goal of artificial intelligence. Here we introduce the interaction network, a model which can reason about how objects in complex systems interact, supporting dynamical predictions, as well as inferences about the abstract properties of the system. Our model takes graphs as input, performs object-and relation-centric reasoning in a way that is analogous to a simulation, and is implemented using deep neural networks. We evaluate its ability to reason about several challenging physical domains: n-body problems, rigid-body collision, and non-rigid dynamics. Our results show it can be trained to accurately simulate the physical trajectories of dozens of objects over thousands of time steps, estimate abstract quantities such as energy, and generalize automatically to systems with different numbers and configurations of objects and relations. Our interaction network implementation is the first general-purpose, learnable physics engine, and a powerful general framework for reasoning about object and relations in a wide variety of complex real-world domains.
translated by 谷歌翻译
The choice of approximate posterior distribution is one of the core problems in variational inference. Most applications of variational inference employ simple families of posterior approximations in order to allow for efficient inference, focusing on mean-field or other simple structured approximations. This restriction has a significant impact on the quality of inferences made using variational methods. We introduce a new approach for specifying flexible, arbitrarily complex and scalable approximate posterior distributions. Our approximations are distributions constructed through a normalizing flow, whereby a simple initial density is transformed into a more complex one by applying a sequence of invertible transformations until a desired level of complexity is attained. We use this view of normalizing flows to develop categories of finite and infinitesimal flows and provide a unified view of approaches for constructing rich posterior approximations. We demonstrate that the theoretical advantages of having posteriors that better match the true posterior, combined with the scalability of amortized variational approaches, provides a clear improvement in performance and applicability of variational inference.
translated by 谷歌翻译
This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation. DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images. The system substantially improves on the state of the art for generative models on MNIST, and, when trained on the Street View House Numbers dataset, it generates images that cannot be distinguished from real data with the naked eye.
translated by 谷歌翻译